Internal Dependence Based F0 Model for Mandarin TTS System

Authors

  • Jianhua Tao
  • Jian Yu
  • Wanzhi Zhang
Abstract

This paper presents a new pitch generation model based on the internal dependence of the pitch contour. The model emphasizes the impact of adjacent syllables’ pitch contours on the current one. A new definition of concatenation cost is presented to measure the naturalness of the pitch contour between every two adjacent syllables. Based on this definition, the model concentrates on removing unnatural pitch contours across concatenation points, which are always the most unstable parts of synthesized speech. The model can generate natural, fluent pitch contours and was shown to capture the essential nature of the pitch contour.

Similar resources

Syllable HMM based Mandarin TTS and comparison with concatenative TTS

This paper introduces a syllable HMM based Mandarin TTS system. 10-state left-to-right HMMs are used to model each syllable. We leverage the corpus and the front end of a concatenative TTS system to build the syllable HMM based TTS system. Furthermore, we utilize the unique consonant/vowel structure of Mandarin syllables to improve the voiced/unvoiced decision of HMM states. Evaluation results s...

The Toshiba Mandarin TTS System for the Blizzard Challenge 2008

This paper describes the Toshiba Mandarin Text-to-Speech (TTS) system that was submitted to the Blizzard Challenge 2008. The front-end of the system uses machine-learning approaches such as generalized linear models (GLM) and Quantification Method Type 1 (QMT1) to predict pause, duration and F0 contour. According to the predicted prosody information, the back-end of the system uses Toshiba’s ow...

Improved generation of prosodic features in HMM-based Mandarin speech synthesis

The HMM-based text-to-speech system can produce high-quality synthetic speech with flexible modeling of spectral and prosodic parameters. However, the prosodic features, like F0 and duration trajectories, generated by HMM-based speech synthesis are often excessively smoothed and lack prosodic variance. In HMM-based TTS, durations are typically modeled statistically using state duration probabili...

Towards the automatic extraction of Fujisaki model parameters for Mandarin

The generation of natural-sounding F0 contours in TTS enhances the intelligibility and perceived naturalness of synthetic speech. In earlier works, the first author developed a linguistically motivated model of German intonation based on the quantitative Fujisaki model of the production process of F0, and an automatic procedure for extracting the parameters from the F0 contour which, however, ...

Generating natural F0 trajectory with additive trees

In HMM-based TTS, while the segmental quality of synthesized speech is quite acceptable, intonation, especially at the sentence level, tends to be somewhat bland. The maximum likelihood (ML) criterion used in HMM training and parameter trajectory generation is partially responsible for the blandness. Additionally, the F0 trajectory thus generated has a smaller dynamic range than that of natural...

Publication date: 2006